Abstract



We consider the problem of computing ratings using the results of
games played between a set of $n$ players, and show how this problem can
be reduced to computing the positive eigenvectors corresponding to the
dominant eigenvalues of certain $n \times n$ matrices. There is a close connection
with the stationary probability distributions of certain Markov chains. In
practice, if $n$ is large, then the matrices involved will be sparse, and the
power method may be used to solve the eigenvalue problems efficiently.
We give an algorithm based on the power method, and also derive the
same algorithm by an independent method.
1 Introduction



Suppose that $n$ players, numbered $1, \ldots, n$, play a total of $m$ games which may
end in a win, loss or draw. We assume that a win scores 1 point, a draw 0.5,
and a loss 0. The results can be summarised by an $n \times n$ score matrix $S = (s_{i,j})$,
where $s_{i,j}$ is the number of points that player $i$ scores against player $j$ ($i \ne j$).

The diagonal entries $s_{i,i}$ are arbitrary; for reasons discussed later we assume
that $s_{i,i} = \alpha$, where $\alpha \ge 0$ is a constant. For the moment, the reader may
assume that $\alpha = 0$.

The aim is to assign ratings $r_i$ to the players in such a way that players who
perform better obtain higher ratings. The problem may arise when S represents 
the results obtained in a single event, e.g. a Swiss tournament, and typically 
we want to use the ratings to break ties between players on equal scores. It 
may also arise when S represents all the recent results involving a large set of 
players, for example in the regular updates to a national rating system. In the 
latter case, we can expect the matrix S to be large and sparse. 
The expected score of one player when playing against another should depend
only on the difference of their ratings. Thus, we assume that, if the ratings are
correct, then the expected score of player $i$ in a game against player $j$ is $f(r_i - r_j)$,
for some function $f : \mathbb{R} \to [0, 1]$. Since the total score obtained by both players
in a game is 1, the function $f$ should satisfy the condition

$$ f(z) + f(-z) = 1 . \tag{1} $$

It is reasonable to assume that $f(z)$ is monotonic increasing, and that

$$ 0 = \lim_{y \to -\infty} f(y) < f(z) < \lim_{y \to +\infty} f(y) = 1 . \tag{2} $$

It is easy to find functions satisfying these conditions. For example, if $\phi(x)$
is a probability density on $(-\infty, +\infty)$, satisfying the condition $\phi(x) > 0$ and
$\phi(x) = \phi(-x)$, then

$$ f(z) = \int_{-\infty}^{z} \phi(x)\,dx $$

satisfies (1) and (2). We could take the normal probability density

$$ \phi(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2), $$


but there seems to be no real justification for this choice, and we shall make a 
more convenient choice below. 

Let $g(z) = f(z)/(1 - f(z))$. For a game without draws, $g(r_i - r_j)$ is the ratio
Prob(player $i$ wins)/Prob(player $j$ wins). From (1), $g(z) = f(z)/f(-z)$ and

$$ g(z)\,g(-z) = 1 . \tag{3} $$

A simple solution to (3) is $g(z) = e^{cz}$ for some constant $c$. Since
$f(z) = g(z)/(1 + g(z))$, this suggests taking

$$ f(z) = \frac{1}{1 + e^{-cz}} . \tag{4} $$

It is easy to check that $f$ satisfies (1) and (2) if $c > 0$.
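The check can be written out explicitly. The following lines are a short verification sketch for the logistic form (4) with $c > 0$; nothing beyond (1), (2) and (4) is assumed.

```latex
\begin{align*}
f(z) + f(-z) &= \frac{1}{1+e^{-cz}} + \frac{1}{1+e^{cz}}
              = \frac{e^{cz}}{e^{cz}+1} + \frac{1}{1+e^{cz}} = 1, \\
f'(z) &= \frac{c\,e^{-cz}}{(1+e^{-cz})^{2}} > 0 \quad \text{for } c > 0, \\
\lim_{z \to -\infty} f(z) &= 0, \qquad \lim_{z \to +\infty} f(z) = 1 .
\end{align*}
```

The first line gives condition (1); the second and third give monotonicity and the limits in (2).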

We shall assume that $f(z)$ has the form (4). This involves an empirical
assumption that could be tested by experiment. For example, given three players
$\{1, 2, 3\}$ and ratings $r_1, r_2, r_3$ such that player 1 has expected score $f(r_1 - r_2)$
against player 2, and player 2 has expected score $f(r_2 - r_3)$ against player 3, is it
true that player 1 has expected score $f(r_1 - r_3)$ against player 3? The outcome of
an experiment to test this hypothesis might depend on the particular players and 
what game they are playing. In practice our assumptions are computationally 
convenient and we expect that they will be a reasonable approximation to the 
truth. 

There is some evidence [1] that the choice (4) of $f(z)$ (corresponding to the
logistic distribution) gives a better approximation to reality, at least for chess,
than the choice based on the normal distribution, as originally proposed by
Elo [6]. Essentially (4) is used in the current USCF rating system [7]. However,
in this paper our choice of (4) is made primarily for computational convenience,
and because it leads to an elegant algorithm.

By scaling the ratings, we can assume that $c = 1$. Thus, in the following we
assume that

$$ f(z) = \frac{1}{1 + e^{-z}} . \tag{5} $$

To avoid the exponential function, it is convenient to define

$$ x_i = \exp(r_i) . $$

Thus, player $i$ has expected score

$$ \frac{1}{1 + x_j/x_i} = \frac{x_i}{x_i + x_j} $$

in a game against player $j$.

The ratings $r_i$ are given by $r_i = \ln x_i$, but we can add an arbitrary constant
$\beta$ to all the $r_i$, since only their differences are significant (this corresponds to
multiplying all the $x_i$ by a positive constant $k = \exp(\beta)$).
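As a small illustration of (5) and of the change of variable $x_i = \exp(r_i)$, the following Python sketch evaluates the expected score in both forms. The function names and the numerical ratings are illustrative assumptions, not part of the paper.

```python
import math

def expected_score_from_ratings(r_i, r_j):
    """Expected score of player i against player j, using f(z) = 1/(1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-(r_i - r_j)))

def expected_score_from_x(x_i, x_j):
    """The same quantity written in terms of x = exp(r)."""
    return x_i / (x_i + x_j)

# Illustrative ratings (hypothetical values).
r1, r2 = 0.7, -0.4
beta = 3.0  # arbitrary constant added to every rating

e12 = expected_score_from_ratings(r1, r2)
e21 = expected_score_from_ratings(r2, r1)
e12_x = expected_score_from_x(math.exp(r1), math.exp(r2))
e12_shifted = expected_score_from_ratings(r1 + beta, r2 + beta)
```

Here `e12 + e21` equals 1 up to rounding, in line with (1), and `e12_shifted` agrees with `e12`, which reflects the fact that only rating differences matter.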

If $i \ne j$, the total number of games played between player $i$ and player $j$ is

$$ g_{i,j} = s_{i,j} + s_{j,i} . $$

In the case $i = j$, we use this as a definition of $g_{i,i}$, that is $g_{i,i} = 2s_{i,i} = 2\alpha$.

Let $W = (w_{i,j})$ be a symmetric matrix of positive weights $w_{i,j}$. Games
between players $i$ and $j$ will be weighted in proportion to $w_{i,j}$.

The actual weighted score of player $i$ is

$$ s_i = \sum_{j=1}^{n} s_{i,j} w_{i,j} $$

and, given the players' ratings, the expected weighted score is

$$ e_i = \sum_{j=1}^{n} \left( \frac{x_i}{x_i + x_j} \right) g_{i,j} w_{i,j} . $$

If the only information available on the players' strengths is the results
encoded in the matrix $S$, then it is reasonable to choose ratings such that the
expected and actual weighted scores of each player are the same, that is

$$ e_i = s_i \quad \text{for } i = 1, \ldots, n . $$

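Using the definitions of $e_i$, $s_i$ and $g_{i,j}$, this condition is

$$ \sum_{j=1}^{n} \left( \frac{x_i}{x_i + x_j} \right) w_{i,j} (s_{i,j} + s_{j,i}) = \sum_{j=1}^{n} s_{i,j} w_{i,j} \quad \text{for } i = 1, \ldots, n . \tag{6} $$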
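To make condition (6) concrete, the sketch below computes the actual and expected weighted scores for a small example. The score matrix is a made-up illustration, constructed so that $x = (4, 2, 1)$ reproduces every pairwise result exactly; the function name `weighted_scores` is likewise an assumption.

```python
def weighted_scores(S, W, x):
    """Return (actual, expected) weighted scores for each player.

    actual[i]   = sum_j S[i][j] * W[i][j]
    expected[i] = sum_j x[i] / (x[i] + x[j]) * (S[i][j] + S[j][i]) * W[i][j]
    """
    n = len(x)
    actual = [sum(S[i][j] * W[i][j] for j in range(n)) for i in range(n)]
    expected = [
        sum(x[i] / (x[i] + x[j]) * (S[i][j] + S[j][i]) * W[i][j] for j in range(n))
        for i in range(n)
    ]
    return actual, expected

# Hypothetical results: 30 games between players 0 and 1, 5 between 0 and 2,
# 3 between 1 and 2, with zero diagonal entries.
S = [[0.0, 20.0, 4.0],
     [10.0, 0.0, 2.0],
     [1.0, 1.0, 0.0]]
W = [[1.0] * 3 for _ in range(3)]   # unit weights
x = [4.0, 2.0, 1.0]                 # candidate solution

actual, expected = weighted_scores(S, W, x)
```

For this data the two lists agree, so $x = (4, 2, 1)$ (or any positive multiple of it) satisfies (6) with unit weights.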



2 Choosing Weights to Give a Linear Problem 

The system of equations (6) is nonlinear in the unknowns $x_i$, and it is not
immediately obvious that it has a solution, or how such a solution should be
found. Clearly a solution is not unique, because if $x = (x_i)$ is one
solution then so is $kx$ for any positive constant $k$.

In order to obtain a linear problem, we choose

$$ w_{i,j} = (x_i + x_j)\, u_{i,j} \tag{7} $$

for some symmetric positive matrix $U = (u_{i,j})$.

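With the choice (7) of weights, equation (6) reduces to

$$ x_i \sum_{j=1}^{n} u_{i,j} (s_{i,j} + s_{j,i}) = \sum_{j=1}^{n} s_{i,j} (x_i + x_j)\, u_{i,j} . $$

Using the symmetry of $U$, this simplifies to

$$ x_i \sum_{j=1}^{n} a_{j,i} = \sum_{j=1}^{n} a_{i,j} x_j , \tag{8} $$

where $a_{i,j} = s_{i,j} u_{i,j}$. The matrix $A = (a_{i,j})$ is a weighted version of the score
matrix $S$, and has the same sparsity pattern as $S$.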
Let

$$ d_i = \sum_{j=1}^{n} a_{j,i} . $$

Then $d_i$ can be interpreted as the (weighted) number of points lost by player $i$, that
is the (weighted) number of points scored by player $i$'s opponents in the games
they played against $i$.

If a player does not lose any points, then the data is insufficient to determine
a finite rating for him - he is "infinitely good". Thus, we assume that $d_i > 0$
for $1 \le i \le n$.

Let $D = \mathrm{diag}(d_i)$ be the diagonal matrix with diagonal elements $d_i$. By our
assumption, $D$ is nonsingular.

3 The Eigenvalue Problem 

The condition (8) can be written in matrix-vector form as

$$ Dx = Ax \tag{9} $$

or

$$ D^{-1} A x = x . \tag{10} $$

Thus, the solution vector $x$ is the eigenvector corresponding to the eigenvalue 1
of the matrix $D^{-1}A$.



We observe that the matrix $A - D$ has linearly dependent rows; in fact, it is
easy to see from the definition of $D$ that the rows of $A - D$ sum to zero (each
column of $A - D$ sums to zero, since $d_i = \sum_j a_{j,i}$). Thus,
$A - D$ is singular, so $D^{-1}A - I$ is singular, and $D^{-1}A$ does in fact have an
eigenvalue 1. Similarly for $AD^{-1}$.

Equation (10) can be interpreted in terms of a Markov chain. Let $y = Dx$ and
$M^T = AD^{-1}$. Then $M$ is the transition matrix of a Markov chain ($m_{i,j} \ge 0$ and
$\sum_j m_{i,j} = 1$). The vector $y/\|y\|_1$ gives the stationary probability distribution,
because $y^T M = y^T$, or equivalently $AD^{-1}y = y$. It follows from standard
theory of Markov matrices that $\rho(D^{-1}A) = \rho(AD^{-1}) \le 1$.
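The Markov chain interpretation can be checked numerically on a tiny example. The sketch below is only an illustration: the $2 \times 2$ matrix $A$ and the vector $x$ are hypothetical values chosen so that $D^{-1}Ax = x$ holds exactly, and `transition_matrix` is an assumed helper name.

```python
def transition_matrix(A):
    """Return (M, d) with M^T = A D^{-1}, i.e. M[i][j] = A[j][i] / d[i],
    where d[i] is the i-th column sum of A."""
    n = len(A)
    d = [sum(A[j][i] for j in range(n)) for i in range(n)]
    M = [[A[j][i] / d[i] for j in range(n)] for i in range(n)]
    return M, d

A = [[1.0, 3.0],
     [1.0, 1.0]]          # small illustrative weighted score matrix
x = [3.0, 1.0]            # satisfies D^{-1} A x = x for this A

M, d = transition_matrix(A)
y = [d[i] * x[i] for i in range(2)]                                  # y = D x
y_next = [sum(y[i] * M[i][j] for i in range(2)) for j in range(2)]   # y^T M
pi = [v / sum(y) for v in y]                                         # y / ||y||_1
```

Each row of `M` sums to 1, `y_next` reproduces `y`, and `pi` is the stationary distribution of the chain for this example.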

In certain degenerate cases we cannot expect finite ratings to be defined by
the data. We already assumed that $d_i > 0$. This is necessary, but not sufficient.
If the players can be split into two disjoint nonempty sets such that players in
the first set always beat the players in the second set, then the players in the
first set are "infinitely better" than the players in the second set. Similarly, if
players in the first set never play players in the second set, we cannot expect
to compare their playing strengths. In practice, in either of these situations, we 
could split the problem and rate players in each set separately. 

In the typical nondegenerate case, $D^{-1}A$ has a simple eigenvalue $\lambda_1 = 1$,
and the other eigenvalues $\lambda_j$ are inside the unit circle, that is $|\lambda_j| < 1$ except
for $\lambda_1 = 1$. Then the power method converges and we can find $x$ by the simple
iteration

$$ x^{(k+1)} = D^{-1} A x^{(k)} $$

with a suitable starting vector, e.g. $x^{(0)} = (1, 1, \ldots, 1)^T$.
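A minimal Python sketch of this power iteration follows. It assumes a fixed weighted score matrix $A$ (here a hypothetical $3 \times 3$ example with diagonal entries 0.5 and $u_{i,j} = 1$) and stops when the largest relative change in any component is small; the function name, the tolerance and the iteration cap are illustrative choices.

```python
def power_method(A, tol=1e-13, max_iter=10000):
    """Iterate x <- D^{-1} A x, with D = diag(column sums of A), starting from ones."""
    n = len(A)
    d = [sum(A[j][i] for j in range(n)) for i in range(n)]   # d[i] = sum_j A[j][i]
    x = [1.0] * n
    for _ in range(max_iter):
        x_new = [sum(A[i][j] * x[j] for j in range(n)) / d[i] for i in range(n)]
        change = max(abs(x_new[i] - x[i]) / x_new[i] for i in range(n))
        x = x_new
        if change < tol:
            break
    return x, d

# Hypothetical weighted score matrix (diagonal entries 0.5).
A = [[0.5, 20.0, 4.0],
     [10.0, 0.5, 2.0],
     [1.0, 1.0, 0.5]]

x, d = power_method(A)
# Residual of D x = A x, which should be close to zero at convergence.
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - d[i] * x[i])
               for i in range(3))
```

No normalisation step is included because the relevant eigenvalue is exactly 1; a production implementation might still rescale $x$ occasionally to guard against drift.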

So far we have not mentioned the role of the constant $\alpha$ (recall that $s_{i,i} = \alpha$). The
solution $x$ of (10) is independent of $\alpha$, but the speed of convergence of the power
method depends on $\alpha$. We have found in our experiments that $\alpha \in [0.2, 0.5]$ is a
good choice to maximise the speed of convergence. Any $\alpha > 0$ will ensure that
$D$ is nonsingular.

4 Modifying the Weights 

We have seen that solving an eigenvalue problem allows us to compute ratings if
the score matrix is weighted by the weight function (7). It would be more natural
to solve the problem with unit weights, that is $w_{i,j} = 1$. Unit weights have the
advantage that, when applied to a round-robin ("all play all") tournament,
players with the same score obtain the same ratings, as is easy to see from (6).
The condition $w_{i,j} = 1$ is equivalent to

$$ u_{i,j} = \frac{1}{x_i + x_j} . $$

Thus, we can regard the solution of the eigenvalue problem (10) as an inner
iteration, and introduce an outer iteration where we change the weights. If $x^{(k)}$
is the solution to the $k$-th eigenvalue problem (with $x^{(0)} = (1, 1, \ldots, 1)^T$ an
initial vector), then the $(k+1)$-st eigenvalue problem will use

$$ u_{i,j}^{(k+1)} = \frac{1}{x_i^{(k)} + x_j^{(k)}} . $$

If the outer iteration converges, then it solves the original problem with weights
$w_{i,j} = 1$.

In practice, we have found that convergence is quite rapid in nondegenerate 
cases. However, it is wasteful to solve the inner eigenvalue problems accurately. 
It is much more efficient to perform just one iteration of the power method 
in the inner loop. The resulting Algorithm 1 (without improvements to take 
advantage of sparsity) is given below. 

    for i := 1..n do
        x[i] := 1.0;
    end for;
    for k := 1, 2, ... until convergence do
        for i := 1..n do
            d[i] := 0.0;
        end for;
        for i := 1..n do
            sum := 0.0;
            for j := 1..n do
                temp := s[i,j]/(x[i] + x[j]);
                sum := sum + temp*x[j];
                d[j] := d[j] + temp;
            end for;
            y[i] := sum;
        end for;
        for i := 1..n do
            x[i] := y[i]/d[i];
        end for;
    end for;

Algorithm 1: Unit Weights



Since the aim is to compute ratings $r_i = \ln x_i$, the convergence test should
ensure a small relative error in each component of $x$. Thus, an appropriate
stopping criterion is

$$ \max_{1 \le i \le n} \left| \frac{x_i^{(k)} - x_i^{(k-1)}}{x_i^{(k)}} \right| < \varepsilon $$

where $\varepsilon$ is a tolerance depending on the accuracy required.
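For readers who prefer runnable code, Algorithm 1 together with the stopping criterion above can be sketched in Python as follows. This is a dense, unoptimised transcription; the function name, the default tolerance, the iteration cap and the example score matrix (diagonal entries 0.5, results consistent with strengths in the ratio 4 : 2 : 1) are assumptions made for illustration.

```python
import math

def algorithm1(S, eps=1e-12, max_iter=10000):
    """Dense transcription of Algorithm 1 (unit weights).

    S[i][j] is the score of player i against player j; the diagonal entries
    hold the damping constant. Returns x with ratings r[i] = ln x[i].
    """
    n = len(S)
    x = [1.0] * n
    for _ in range(max_iter):
        d = [0.0] * n
        y = [0.0] * n
        for i in range(n):
            for j in range(n):
                temp = S[i][j] / (x[i] + x[j])
                y[i] += temp * x[j]
                d[j] += temp
        x_new = [y[i] / d[i] for i in range(n)]
        # Relative change in each component, as in the stopping criterion.
        change = max(abs(x_new[i] - x[i]) / x_new[i] for i in range(n))
        x = x_new
        if change < eps:
            return x
    raise RuntimeError("no convergence; the problem may be degenerate")

# Hypothetical results with diagonal entries 0.5.
S = [[0.5, 20.0, 4.0],
     [10.0, 0.5, 2.0],
     [1.0, 1.0, 0.5]]

x = algorithm1(S)
ratings = [math.log(v) for v in x]   # defined up to an additive constant
```

Only the ratios of the components of `x` (equivalently, the differences of the entries of `ratings`) are meaningful; for this data `x[0]/x[2]` is close to 4 and `x[1]/x[2]` is close to 2.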

Failure to converge in a reasonable number of iterations may indicate that
the problem is degenerate and that some ratings are diverging to $\pm\infty$. In this
case one or more players should in principle be excluded from consideration. A
more convenient solution in practice is to add a "dummy" player who draws
with all the other players, and whose games are given a positive weight $\gamma$, for
example $\gamma = 1$. As $\gamma \to 0+$ the ratings tend to the correct values (possibly $\pm\infty$),
but for any positive $\gamma$ we obtain a nondegenerate problem and finite ratings.¹
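One way to realise the dummy player is to extend the score matrix before running the iteration. The sketch below rests on an assumption about how the weight enters: since the weights in (6) only ever multiply scores, a drawn game of weight $\gamma$ is represented by score entries $\gamma/2$ in each direction. The function name and the `alpha` parameter (the diagonal entry given to the dummy player) are hypothetical.

```python
def add_dummy_player(S, gamma=1.0, alpha=0.5):
    """Return a copy of S extended by one dummy player who draws with everyone.

    Assumption: a draw of weight gamma contributes gamma/2 to both players.
    alpha is the diagonal entry assigned to the dummy player.
    """
    n = len(S)
    half = 0.5 * gamma
    T = [row[:] + [half] for row in S]   # each real player scores gamma/2 against the dummy
    T.append([half] * n + [alpha])       # the dummy scores gamma/2 against each real player
    return T

# Degenerate example: player 0 wins all three games against player 1.
S = [[0.5, 3.0],
     [0.0, 0.5]]
T = add_dummy_player(S, gamma=1.0, alpha=0.5)
```

In the extended matrix every player has both won and lost some (weighted) points, so neither of the degenerate situations described in Section 3 can occur.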

If $n$ is large then $S$ (and hence $A$) will be sparse, since there are at most
two off-diagonal entries for each game played. Thus, the number of nonzero
elements is at most $2m + n$. The inner loop of the algorithm above essentially
involves the multiplication of $A$ on the right by $x$, and on the left by $[1, 1, \ldots, 1]$.
Thus, standard sparse matrix techniques can be used to reduce the complexity
of the inner iteration from $O(n^2)$ to $O(m)$.
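A simple way to exploit sparsity, sketched below, is to store one record per pair of players who actually met and to treat the constant diagonal analytically. The data layout (tuples `(i, j, s_ij, s_ji)`), the function name and the `alpha` default are assumptions for illustration; the cost of one step is proportional to the number of stored pairs plus $n$.

```python
def sparse_step(n, pairs, x, alpha=0.5):
    """One iteration of Algorithm 1 using a list of pair records.

    pairs: tuples (i, j, s_ij, s_ji) with i != j, one per pair that played.
    alpha: the constant diagonal entry s[i][i], handled in closed form.
    """
    num = [alpha / 2.0] * n                          # diagonal term of the numerator
    den = [alpha / (2.0 * x[i]) for i in range(n)]   # diagonal term of the denominator
    for i, j, s_ij, s_ji in pairs:
        inv = 1.0 / (x[i] + x[j])
        num[i] += s_ij * x[j] * inv
        num[j] += s_ji * x[i] * inv
        den[i] += s_ji * inv    # points scored against player i
        den[j] += s_ij * inv    # points scored against player j
    return [num[i] / den[i] for i in range(n)]

# Hypothetical results: three pairs of players.
pairs = [(0, 1, 20.0, 10.0), (0, 2, 4.0, 1.0), (1, 2, 2.0, 1.0)]
x1 = sparse_step(3, pairs, [1.0, 1.0, 1.0], alpha=0.5)
```

Starting from the all-ones vector, this reproduces the first dense iterate for the same data, namely $(12.25/5.75,\ 6.25/10.75,\ 1.25/3.25)$.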

In practice the final ratings would be modified by a linear transformation to 
make them positive and not too small, before rounding to the nearest integer. 
(FIDE Elo ratings are usually in the range [0, 3000], and BCF ratings are usually
in the range [0, 300], see [5].)
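The final linear transformation might look like the following sketch. The `scale` and `offset` values are hypothetical: the scale $400/\ln 10$ mimics the common Elo convention in which a 400-point gap corresponds to odds of 10 : 1, and the offset 1500 is an arbitrary centre.

```python
import math

def display_ratings(x, scale=400.0 / math.log(10.0), offset=1500.0):
    """Map x[i] to rounded integer ratings offset + scale * ln(x[i]).

    scale and offset are illustrative choices, not prescribed by the paper.
    """
    return [round(offset + scale * math.log(xi)) for xi in x]

shown = display_ratings([1.0, 10.0])
```

With these choices a player whose $x$ value is ten times larger is displayed 400 points higher.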

5 An Independent Derivation of Algorithm 1 

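From (6) with $w_{i,j} = 1$ we have, for $i = 1, \ldots, n$,

$$ \sum_{j=1}^{n} \left( \frac{x_i}{x_i + x_j} \right) (s_{i,j} + s_{j,i}) = \sum_{j=1}^{n} s_{i,j} . \tag{11} $$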



This simplifies to

$$ \sum_{j=1}^{n} \frac{x_i s_{j,i}}{x_i + x_j} = \sum_{j=1}^{n} \frac{x_j s_{i,j}}{x_i + x_j} $$



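and, taking $x_i$ outside the sum on the left, we see that

$$ x_i = \left. \sum_{j=1}^{n} \frac{s_{i,j} x_j}{x_i + x_j} \right/ \sum_{j=1}^{n} \frac{s_{j,i}}{x_i + x_j} . \tag{12} $$

A natural iteration to solve (12) is

$$ x_i^{(k+1)} = \left. \sum_{j=1}^{n} \frac{s_{i,j} x_j^{(k)}}{x_i^{(k)} + x_j^{(k)}} \right/ \sum_{j=1}^{n} \frac{s_{j,i}}{x_i^{(k)} + x_j^{(k)}} \tag{13} $$

for $k = 1, 2, \ldots$. However, it is easy to see that this is exactly the iteration
implemented in Algorithm 1!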

¹ This solution is similar to the one adopted in the PageRank algorithm used by the Google
search engine [2, 10, 11], where a fictional page essentially has links to every other page. Our
parameter $\gamma$ corresponds to the PageRank algorithm's $1 - d$.

The significance of the diagonal terms $s_{i,i} = \alpha \ge 0$ is apparent if we consider
the diagonal terms ($j = i$) in the numerator and denominator of (13). The
diagonal term in the numerator is $\alpha/2$ and the diagonal term in the denominator
is $\alpha/(2x_i^{(k)})$. As $\alpha \to \infty$ the right-hand side of (13) $\to x_i^{(k)}$. Thus, $\alpha$ acts as a
damping factor: larger values of $\alpha$ tend to reduce the change $x_i^{(k)} \to x_i^{(k+1)}$ at
each iteration.
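The damping effect can be made explicit by separating the diagonal terms in (13). In the sketch below, $\alpha$ denotes the constant value of the diagonal entries $s_{i,i}$, and $N_i$, $D_i$ are shorthand introduced here (not notation from the text) for the off-diagonal parts of the numerator and denominator.

```latex
\[
N_i = \sum_{j \ne i} \frac{s_{i,j}\, x_j^{(k)}}{x_i^{(k)} + x_j^{(k)}}, \qquad
D_i = \sum_{j \ne i} \frac{s_{j,i}}{x_i^{(k)} + x_j^{(k)}},
\]
\[
x_i^{(k+1)} = \frac{\alpha/2 + N_i}{\alpha/(2x_i^{(k)}) + D_i}
            = x_i^{(k)} \, \frac{\alpha + 2N_i}{\alpha + 2x_i^{(k)} D_i} .
\]
```

The last factor tends to 1 as $\alpha \to \infty$, so the update moves $x_i$ less and less, while the fixed-point condition $N_i = x_i D_i$ does not involve $\alpha$ at all. This is consistent with the earlier remark that the solution is independent of the diagonal constant but the speed of convergence is not.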

Other iterations can be obtained in a similar manner. For example, taking
$x_i$ outside the sum on the left side of (11), we obtain the iteration

$$ x_i^{(k+1)} = \left. \sum_{j=1}^{n} s_{i,j} \right/ \sum_{j=1}^{n} \frac{s_{i,j} + s_{j,i}}{x_i^{(k)} + x_j^{(k)}} . \tag{14} $$




However, our numerical experiments suggest that (14) gives slower convergence
than (13). This conclusion is confirmed by a first-order analysis in the special
case that all the $x_i$ are approximately equal.
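The alternative iteration (14) is equally short to implement. The Python sketch below is an illustration only: the function name, tolerance, iteration cap and example data (diagonal entries 0.5, results consistent with strengths 4 : 2 : 1) are assumptions, and the returned iteration count is provided so that readers can compare it with a run of Algorithm 1 on their own data.

```python
def iterate_14(S, eps=1e-12, max_iter=100000):
    """Iteration (14): x[i] <- sum_j S[i][j] / sum_j (S[i][j] + S[j][i]) / (x[i] + x[j]).

    Returns (x, number of iterations used).
    """
    n = len(S)
    total = [sum(S[i]) for i in range(n)]   # sum_j S[i][j], fixed across iterations
    x = [1.0] * n
    for k in range(1, max_iter + 1):
        x_new = [
            total[i] / sum((S[i][j] + S[j][i]) / (x[i] + x[j]) for j in range(n))
            for i in range(n)
        ]
        change = max(abs(x_new[i] - x[i]) / x_new[i] for i in range(n))
        x = x_new
        if change < eps:
            return x, k
    raise RuntimeError("no convergence; the problem may be degenerate")

# Hypothetical results with diagonal entries 0.5.
S = [[0.5, 20.0, 4.0],
     [10.0, 0.5, 2.0],
     [1.0, 1.0, 0.5]]

x, iterations = iterate_14(S)
```

Both (13) and (14) have the same fixed points, so for this data the ratios `x[0]/x[2]` and `x[1]/x[2]` again approach 4 and 2.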

6 Incorporating Old Ratings 

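Often some or all of the players will already have ratings, say player $i$ has rating
$\hat{x}_i$ based on $w_i$ games.² It is easy to take such "old" ratings into account by
a slight modification of the argument leading to equation (6). We add $w_i/2$ to
the actual weighted score $s_i$, as if player $i$ drew $w_i$ games against a player with
rating $\hat{x}_i$, and also add $w_i x_i/(x_i + \hat{x}_i)$ to the expected weighted score $e_i$. Thus
equation (6) becomes

$$ \sum_{j=1}^{n} \left( \frac{x_i}{x_i + x_j} \right) w_{i,j} (s_{i,j} + s_{j,i}) + \frac{w_i x_i}{x_i + \hat{x}_i} = \sum_{j=1}^{n} s_{i,j} w_{i,j} + \frac{w_i}{2} \tag{15} $$

for $i = 1, \ldots, n$, and equation (11) becomes

$$ \sum_{j=1}^{n} \left( \frac{x_i}{x_i + x_j} \right) (s_{i,j} + s_{j,i}) + \frac{w_i x_i}{x_i + \hat{x}_i} = \sum_{j=1}^{n} s_{i,j} + \frac{w_i}{2} . \tag{16} $$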
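Now, it is easy to see that the iteration (13) becomes

$$ x_i^{(k+1)} = \frac{\displaystyle \sum_{j=1}^{n} \frac{s_{i,j} x_j^{(k)}}{x_i^{(k)} + x_j^{(k)}} + \frac{w_i \hat{x}_i}{2(x_i^{(k)} + \hat{x}_i)}}{\displaystyle \sum_{j=1}^{n} \frac{s_{j,i}}{x_i^{(k)} + x_j^{(k)}} + \frac{w_i}{2(x_i^{(k)} + \hat{x}_i)}} \tag{17} $$

for $k = 1, 2, \ldots$.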



² The weight $w_i$ associated with an old rating might be reduced by a constant factor, say
0.5, to give less weight to old games than to recent ones.
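The modified iteration (17) can be sketched in Python as below. The names `x_old` (the old ratings, in the same exponential scale as $x$) and `w` (the number of games behind each old rating) are illustrative, as are the tolerance, the iteration cap and the example data. In the example the old ratings agree exactly with the new results, so the iteration should return them unchanged.

```python
def iterate_with_old_ratings(S, x_old, w, eps=1e-12, max_iter=10000):
    """Iteration (17): Algorithm 1 with each old rating treated as w[i] drawn
    games against a player of strength x_old[i]."""
    n = len(S)
    x = [1.0] * n
    for _ in range(max_iter):
        x_new = []
        for i in range(n):
            num = sum(S[i][j] * x[j] / (x[i] + x[j]) for j in range(n))
            den = sum(S[j][i] / (x[i] + x[j]) for j in range(n))
            anchor = w[i] / (2.0 * (x[i] + x_old[i]))   # w_i / (2 (x_i + old x_i))
            x_new.append((num + anchor * x_old[i]) / (den + anchor))
        change = max(abs(x_new[i] - x[i]) / x_new[i] for i in range(n))
        x = x_new
        if change < eps:
            return x
    raise RuntimeError("no convergence")

# Hypothetical results (diagonal 0.5) consistent with strengths 4 : 2 : 1.
S = [[0.5, 20.0, 4.0],
     [10.0, 0.5, 2.0],
     [1.0, 1.0, 0.5]]
x_old = [4.0, 2.0, 1.0]      # old ratings in the x scale
w = [10.0, 10.0, 10.0]       # games behind each old rating

x = iterate_with_old_ratings(S, x_old, w)
```

Unlike the iteration without old ratings, the scale of $x$ is now fixed by the old ratings, so the components themselves (not just their ratios) are determined.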



7 Conclusion 

We have shown how the computation of ratings for players of chess (and other 
games) can be reduced to an eigenvalue problem, or a sequence of eigenvalue 
problems, and that these eigenvalue problems are closely related to the problem 
of computing the stationary probability distributions of certain Markov chains. 
The eigenvalue problems can be solved efficiently by the power method. We 
derived an algorithm (Algorithm 1) by two different methods; one derivation 
(§§3-4) used the power method, the other (§5) was from first principles.

We have tested Algorithm 1 on small examples and on some larger examples 
with simulated scores. It would be interesting to test the algorithm on real data 
and to see how it compares with the methods currently in use for chess [9, 1, 6, 7]
and other games, e.g. football [3]. Because they are practical systems that have 
evolved over time, most of these methods involve various ad hoc features and 
piecewise linear approximations, so they are less elegant (though perhaps more 
practical) than the relatively simple algorithm discussed here. 